Associated with each HTML page file, there will normally be image files (and increasingly 
also sound and video files) which will be automatically loaded into the client with the page 
file. Furthermore, a page may be divided up into a plurality of frames, as defined by a 
frame definition file, into each of which content files can be independently loaded. 

It is well known to collect usage data for a website (such as the website 10 of Figure 1) by 
noting each time each individual page of the site is requested (often called a "hit") during 
the course of a day. Such data may then be analysed to produce basic statistical data such 
as the number of overall hits on the website by day/month/multiple months, and the 
number of hits for each page by day/month/multiple months. Collecting additional data 
associated with each hit (file request) can provide further useful data; for example, noting 
the origin of each file request permits the identification of the most productive "portal" 
providing a hyperlink to the website. 

Another useful type of information that can be collected is the behaviour and preferences 
of users. The collection of this type of information requires each requesting client (or 
associated user) to be identifiable at least during the course of a session of interaction with 
the website. There are several ways of doing this, one of the best known being the 
use of "cookies" that, at the request of the website, are stored by clients and supplied back 
to the site with every file request; "cookies" permit the usage of the site by individual 
clients to be tracked across multiple sessions of interaction. Another method of tracking 
website usage by individual clients, at least during a single session of interaction, is to 
attach a client identifier to every URL contained in pages served to each client, the 
identifier being allocated when the first page request is received during a session of 
interaction; with this arrangement, the identifier is automatically returned by the client with 
every file request (the identifier being stripped off the URL path information before the 
file is retrieved and then added onto every URL in that file as it is downloaded). 
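
The URL-based tracking scheme just described can be sketched as follows. This is a minimal illustration only, not the mechanism of any particular server: the `;cid=` suffix, the function names `new_client_id`, `strip_client_id` and `tag_urls`, and the use of a regular expression to locate `href` attributes are all assumptions made for the example.

```python
import re
import uuid
from typing import Optional, Tuple

# Hypothetical marker separating the URL path from the client identifier.
ID_MARKER = ";cid="


def new_client_id() -> str:
    """Allocate an identifier when the first page request of a session arrives."""
    return uuid.uuid4().hex


def strip_client_id(path: str) -> Tuple[str, Optional[str]]:
    """Split an incoming request path into (real path, client ID or None)."""
    real_path, marker, client_id = path.partition(ID_MARKER)
    return real_path, (client_id if marker else None)


def tag_urls(html: str, client_id: str) -> str:
    """Append the client ID to every site-relative href in a page being served."""
    def add_id(match: "re.Match[str]") -> str:
        return f"{match.group(1)}{match.group(2)}{ID_MARKER}{client_id}{match.group(3)}"

    # Links that start with a scheme (for example "http:") are treated as
    # off-site in this sketch and are left untouched.
    return re.sub(r'(href=")(?![a-z]+:)([^"]+)(")', add_id, html)
```

On each request the server would call `strip_client_id` before retrieving the file and `tag_urls` on the file as it is sent, so that the next request from the same client carries the identifier back.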

Tracking how particular users navigate a site is useful in determining which groups of 
topics are of common interest to particular groups of users; this is not only of interest for 
customer behaviour analysis on commercial sites but also permits a degree of predictive 

Because the HTTP protocol is a stateless protocol, it is possible for a web client to 
cease to be interested in a website without the latter being aware that the client has 
moved on; in this case, it would be incorrect to continue to consider that client as 
having a current monitored entity on the website. In order to minimise this possible 
source of error, the "current" status of a monitored entity associated with a particular 
client is cancelled when the time elapsed since a request from that client has exceeded a 
predetermined cut-off value. In fact, the website may be provided with an indication 
that a particular client has ceased to be interested in the site (for example, through a 
log-off procedure or by ensuring that the site is involved whenever an off-site link is 
activated from one of its own pages); in such cases, this indication is used to ensure that 
there are no "current" monitored entities associated with the client concerned. 

In one embodiment, the monitored entities are individual files corresponding to 
respective pages of the website. In this case, the currency information can comprise, for 
each client, a client data item including an indication of the last preceding page file 
requested by that client; step (c) then involves determining whether the last preceding 
page file is a monitored entity. 

In another embodiment, at least one monitored entity is defined in terms of a 
combination of a particular frame-definition file and a predetermined file serving as a 
source file for a frame defined by the frame-definition file. In this case, the currency 
information can comprise, for each client, a client data item including a list of the last 
preceding files requested by that client; step (c) then involves determining from the list 
whether said at least one monitored entity is current for that client, which is taken to be 
so when both the particular frame-definition file and the predetermined file are current. 
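
The currency test for such a frame-based monitored entity might be sketched as below. The sketch rests on assumptions that the text does not state: the client's list is modelled as (file, frame) pairs held oldest first, a file loaded into the whole window (labelled `_top` here) is taken to replace everything displayed, and a file loaded into a frame is taken to replace only the previous content of that frame. All names are hypothetical.

```python
from typing import Dict, List, Tuple

TOP = "_top"  # hypothetical label for a file that fills the whole window

History = List[Tuple[str, str]]  # (file_id, frame) pairs, oldest first


def displayed_files(history: History) -> Dict[str, str]:
    """Replay the history and return {frame: file} for what is still displayed."""
    shown: Dict[str, str] = {}
    for file_id, frame in history:
        if frame == TOP:
            shown = {TOP: file_id}   # a new top-level file clears all frames
        else:
            shown[frame] = file_id   # a frame load supersedes that frame only
    return shown


def frame_entity_is_current(history: History, frame_def_file: str,
                            source_file: str) -> bool:
    """True when both the frame-definition file and the source file are current."""
    shown = displayed_files(history)
    in_a_frame = any(f == source_file
                     for frame, f in shown.items() if frame != TOP)
    return shown.get(TOP) == frame_def_file and in_a_frame
```

Under these assumptions the entity stops being current either when another file is loaded into the frame that held the source file or when the client moves to a different top-level page.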

In a further embodiment, at least one monitored entity is defined in terms of a 
sequential combination of first and second predetermined files in that order. In this 
case, the currency information can comprise, for each client, a client data item 
including a list of the last preceding files requested by that client; step (c) then involves 
determining from the list whether said at least one monitored entity is current for that 

Figure 1 is a diagram showing a known website arrangement to which the present 
invention can be applied; 

Figure 2 is a diagram showing the properties and methods of a database object used 
in a first embodiment of the website usage monitoring method of the 
invention; 

Figure 3 is a diagram of a first form of output produced by the Figure 2 embodiment; 

Figure 4 is a diagram of a second form of output produced by the Figure 2 
embodiment; 

Figure 5 is a diagram showing the properties and methods of a database object used 
in a second embodiment of the website usage monitoring method of the 
invention; and 

Figure 6 is a diagram showing an analysis method used in the Figure 5 embodiment. 

Best Mode of Carrying Out the Invention 

Figure 2 illustrates the application of a first embodiment of the invention for 
monitoring usage of the website 10 of Figure 1; in the present example, the monitoring 
method is arranged to determine the current distribution of clients across all the pages 
of the site. 

In the Figure 2 arrangement, the HTTP server 12 is set up to identify with a client ID 
each client visiting the site during a current session of interaction; examples of how this 
may be achieved have already been described above, and any other suitable method 
may also be used. 

Each time a page is requested by a client, the server 12 is configured to output a 
message 22 containing both the client ID of the requesting client and an ID for the 
requested page. The messages 22 are passed to a database object 21 that forms the main 
element for implementing the first embodiment of the present invention. 
preferably in a graphical form. Figure 3 illustrates one form of output in which the 
analysis method has represented the counts by iconic people placed in a site structure 
diagram showing the logical organisation of the site pages; in this case, each people 
icon represents, for example, 100 clients. Thus, for the Figure 3 example, there are 
about 400 clients currently visiting page P3 whilst only around 50 clients are currently 
visiting page P5. 

Figure 4 shows another possible form of output from the analysis method; in this 
example, only the top five most currently-visited pages are depicted, this time in 
histogram form and in descending order of popularity. 
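
A text-only approximation of such a top-five histogram is sketched below. The function name, the `#` bar character and the scale of one mark per 100 clients are assumptions chosen for illustration; the figures themselves are not reproduced here.

```python
from collections import Counter
from typing import List, Mapping


def top_pages_histogram(counts: Mapping[str, int], top_n: int = 5,
                        clients_per_mark: int = 100) -> List[str]:
    """Return one text line per page for the top_n pages, most visited first."""
    lines = []
    for page, n in Counter(counts).most_common(top_n):
        # Every page with at least one client gets at least one mark.
        marks = "#" * max(1, round(n / clients_per_mark)) if n else ""
        lines.append(f"{page}: {marks} ({n})")
    return lines
```

The client counts passed in are whatever the analysis has produced for each page; pages outside the top five are simply omitted from the output.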

When a client ceases to be interested in the website 10, there will be no indication sent 
to the site of this unless special mechanisms have been built into the site. Where such 
mechanisms have not been provided, it is preferable to treat the page last requested by a 
client as "current" (or the client as "current" to the site) only if the page was requested within 
a predetermined cut-off time limit. This can be achieved in the case of the Figure 2 
embodiment by time-stamping the messages 22 and holding the timestamp in the 
corresponding client entry along with the last-requested page ID; when the analysis 
method 29 is run, any entry having a timestamp older than the cut-off period is then 
ignored (and preferably deleted). In fact, the same effect can be achieved without the 
use of timestamps simply by re-initialising the table 23 after each analysis and then 
collecting data for a period equal to the desired cut-off period before carrying out 
another analysis. However, this limits the frequency of analysis and it is preferable to 
use a timestamp to eliminate entries that are too old. 
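
The timestamp approach can be illustrated with the following sketch, in which a dictionary stands in for the table 23. The class name `PageTable`, its method names and the choice of seconds as the time unit are assumptions for the example and are not taken from the figures.

```python
import time
from collections import Counter
from typing import Dict, Optional, Tuple


class PageTable:
    """Per-client table: client ID -> (last-requested page ID, timestamp)."""

    def __init__(self, cutoff_seconds: float) -> None:
        self.cutoff_seconds = cutoff_seconds
        self.entries: Dict[str, Tuple[str, float]] = {}

    def handle_message(self, client_id: str, page_id: str,
                       timestamp: Optional[float] = None) -> None:
        """Record a page request; it supersedes any earlier entry for the client."""
        if timestamp is None:
            timestamp = time.time()
        self.entries[client_id] = (page_id, timestamp)

    def analyse(self, now: Optional[float] = None) -> Counter:
        """Count current clients per page, deleting entries older than the cut-off."""
        if now is None:
            now = time.time()
        stale = [cid for cid, (_, ts) in self.entries.items()
                 if now - ts > self.cutoff_seconds]
        for cid in stale:
            del self.entries[cid]
        return Counter(page for page, _ in self.entries.values())
```

Because stale entries are removed at analysis time, the analysis can be run as often as desired without waiting for a full cut-off period to elapse.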


In a variant of the Figure 2 embodiment, the analysis is effected on an on-going basis 
by having the add and update methods 24, 25 increment and decrement client counts for 
each page as appropriate. In this case, the client counts for each page are kept all the 
time; at each execution of the add method 24, the count for the page concerned is 
incremented, whereas at each execution of the update method 25, the count for the page 
being superseded is decremented and the count for the new page is incremented. 
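
A minimal sketch of this on-going variant follows. The `add` and `update` functions mirror the add and update methods 24, 25 only in spirit; the class name `RunningCounts`, the dictionary layout and the dispatching `handle_message` method are assumptions made for the example.

```python
from collections import Counter
from typing import Dict


class RunningCounts:
    """Keeps per-page client counts up to date as each request arrives."""

    def __init__(self) -> None:
        self.last_page: Dict[str, str] = {}   # client ID -> last-requested page
        self.counts: Counter = Counter()      # page ID -> current client count

    def add(self, client_id: str, page_id: str) -> None:
        """New client: record its page and increment that page's count."""
        self.last_page[client_id] = page_id
        self.counts[page_id] += 1

    def update(self, client_id: str, page_id: str) -> None:
        """Known client: decrement the superseded page, increment the new one."""
        old_page = self.last_page[client_id]
        self.counts[old_page] -= 1
        self.counts[page_id] += 1
        self.last_page[client_id] = page_id

    def handle_message(self, client_id: str, page_id: str) -> None:
        """Dispatch to update for a known client, otherwise to add."""
        if client_id in self.last_page:
            self.update(client_id, page_id)
        else:
            self.add(client_id, page_id)
```

With this arrangement the current distribution can be read from `counts` at any moment without scanning the client entries.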
entities each constituted by a respective file loaded in a different one of the frames of a 
multi-frame page). 

As in the Figure 2 embodiment, in the Figure 5 embodiment, the server 12 is arranged 
to pass a message 52 to a database object 51 each time a client requests a website file. 
In the present example, the message 52 contains not only the ID of the requesting client 
and of the requested file, but also a timestamp. The database object 51 includes a table 
53 that is similar to the table 23 in that it includes an entry for each current client; 
however, rather than each entry simply recording the ID of the most-recently requested 
page, because of the more complex types of monitored entity to be handled, each entry 
keeps a history list of at least the last several files requested, including an indication of 
the frame and/or window in which each file is displayed. Each table entry also stores 
the timestamp associated with the file most recently requested by the relevant client. 

As for the database object 21 of Figure 2, the database object 51 of Figure 5 has add, 
update and analysis methods 54, 55 and 59. However, in addition, database object 51 
has delete and purge methods 56 and 57. The delete method 56, when invoked, simply 
deletes an entry identified by client ID from the table 53. The purge method 57, when 
invoked, scans the table 53 and deletes entries that have not been updated more recently 
than a cut-off time. 
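
The delete and purge operations can be sketched as follows, with a dictionary standing in for the table 53. The names `ClientEntry`, `HistoryTable` and `record`, and the decision to pass the cut-off to `purge` as an absolute time, are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClientEntry:
    timestamp: float                                   # time of latest request
    history: List[str] = field(default_factory=list)   # recently requested files


class HistoryTable:
    """Per-client table holding a history list and a latest-request timestamp."""

    def __init__(self) -> None:
        self.entries: Dict[str, ClientEntry] = {}

    def record(self, client_id: str, file_id: str, timestamp: float) -> None:
        """Append a requested file to the client's history and refresh its timestamp."""
        entry = self.entries.setdefault(client_id, ClientEntry(timestamp))
        entry.history.append(file_id)
        entry.timestamp = timestamp

    def delete(self, client_id: str) -> None:
        """Remove the entry for one client, if present."""
        self.entries.pop(client_id, None)

    def purge(self, cutoff_time: float) -> int:
        """Remove entries not updated more recently than cutoff_time; return how many."""
        stale = [cid for cid, entry in self.entries.items()
                 if entry.timestamp <= cutoff_time]
        for cid in stale:
            del self.entries[cid]
        return len(stale)
```

In this sketch `delete` serves the case where the site is told that a client has left, while `purge` handles clients that simply stop sending requests.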

In operation, when a message 52 is passed to the database object 51, it is handled by the 
update and add methods 55, 54 in the same manner as effected by the corresponding 
methods of Figure 2, with the exception that now the file ID information is added to the 
history list for the client rather than superseding the previous file ID. Since, with pages 
employing frames, it is necessary to know into which frame a requested file is to be 
loaded in order to generate a history list properly representing the evolution of the 
client's view of the site, the update and add methods must know something about the 
structure of the site and its pages. Accordingly, a structure table 58 is provided. When 
the update or add method adds a file to the history list of a client entry in table 53, it 
first ascertains from this structure table where the file is to be added into the history list. 
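
One way in which a structure table might be consulted when extending a history list is sketched below. The layout of the table (a mapping from file ID to the frame or window that receives the file), the example file names, the `_top` label and the bounded list length are all assumptions; the actual content of the structure table 58 is not specified in the text above.

```python
from typing import Dict, List, Tuple

# Hypothetical structure table: file ID -> frame name, or "_top" for the window.
STRUCTURE_TABLE: Dict[str, str] = {
    "frames.html": "_top",
    "menu.html": "left",
    "news.html": "main",
    "sport.html": "main",
}


def add_to_history(history: List[Tuple[str, str]], file_id: str,
                   structure: Dict[str, str] = STRUCTURE_TABLE,
                   max_len: int = 10) -> None:
    """Append (file_id, frame) to a client's history, keeping the last max_len items."""
    # Assumption: a file missing from the structure table fills the whole window.
    frame = structure.get(file_id, "_top")
    history.append((file_id, frame))
    if len(history) > max_len:
        del history[:len(history) - max_len]
```

Recording the frame alongside each file is what later allows an analysis to decide which files are still displayed and which have been superseded.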
entity is present and current in all its elements (step 70). If the monitored entity is 
found to be present and current, then its count is incremented (step 68) before passing 
to step 66; otherwise step 66 is proceeded to directly. 

The output step 71 involves generating a graphic output (for example of the form 
illustrated in Figure 3 or 4) on the basis of the counts derived for the monitored entities. 

With regard to the delete method 56, this is called whenever the web server 12 receives 
a positive indication that a client has ceased to be interested in the site (for example, 
through execution of a logoff sequence or by following an offsite link where activation 
of the latter causes the site to be notified). On detecting such an indication, the server 
12 sends an appropriate message to the database object 51, which passes it to the delete 
method to cause the corresponding entry in the table 53 to be deleted. 

It will be appreciated that many variants are possible to the above-described 
embodiments of the present invention. For example, with respect to the Figure 5 
embodiment, it is possible to dispense with the need for the structure table 58 by the 
expedient of having the server 12 tag every site-related URL in every file being sent to 
a client with the name of the target window or frame for the file identified by the URL. 
When the URL is returned with a file request to the server 12, the server strips off the target 
window/frame name and passes it to the database object 51 as part of the message 52, 
thereby avoiding the need for the add/update methods to ascertain this information 
themselves. 

Of course, the current distribution information can be made available not only to the 
site system administrator but also to clients (end users) to indicate to them the parts of 
the site found by others to be of the most interest. 
5. A method according to claim 4, wherein said end-of-interaction indication is at 
least one of the following: 

- a file request sent by the client to the website for a file not located at the website; 

- a user-initiated session termination message; 

- a connection termination message. 

6. A method according to claim 1, wherein said monitored entities are individual files 
corresponding to respective pages of the website, said currency information comprising 
for each said client a client data item including an indication of the last preceding page 
file requested by that client, and step (c) involving determining whether said last 
preceding page file is a monitored entity. 

7. A method according to claim 1, wherein at least one said monitored entity is defined 
in terms of a combination of a particular frame-definition file and a predetermined file 
serving as a source file for a frame defined by said frame-definition file, said currency 
information comprising for each said client a client data item including a list of the last 
preceding files requested by that client, and step (c) involving determining from said 
list whether said at least one monitored entity is current for that client, said at least one 
monitored entity being treated as current when both said particular frame-definition file 
and said predetermined file are current. 

8. A method according to claim 1, wherein at least one said monitored entity is defined 
in terms of a sequential combination of first and second predetermined files in that 
order, said currency information comprising for each said client a client data item 
including a list of the last preceding files requested by that client, and step (c) 
involving determining from said list whether said at least one monitored entity is 
current for that client, said at least one monitored entity being treated as current when 
said first predetermined file has been superseded by said second file and the latter is 
current. 
9. A method according to any one of claims 6 to 8, wherein each said client data item 
includes a timestamp indicating when the last file request was handled for the client 
concerned, the method involving periodically purging the currency information of all 
client data items having a timestamp older than a predetermined cut-off time. 

10. A method according to claim 9, wherein step (c) is carried out repeatedly with said 
particular point in time for each repetition of step (c) being the point in time when that 
repetition is effected, said purging being effected at the commencement of each 
repetition of step (c). 

11. A method according to claim 1, wherein said output generated in step (c) takes the 
form of a graphical display of the structure of the website including representations of 
said monitored entities visually indicating the relative magnitudes of the number of 
clients currently associated with each entity. 

12. A method of monitoring the usage of a website involving the steps of: 

- identifying clients by identifiers 
provided to the site by the client with each page request from that client; 

- at each request by a client for a page of the website, at least where that page is 
different from a page currently being browsed by the client: 

  - generating and storing a current-presence indication indicating that the 
    client, as represented by the client's identifier, is currently browsing that 
    page, and 

  - removing any prior current-presence indication for that client indicating the 
    client's presence at a different page; 

- generating from said current-presence indications an output indicative of the current 
distribution of clients across the pages of the website. 




Patent Office 
An Executive Agency of the Department of Trade and Industry 

Patents Act 1977 
Search Report under Section 17 

Application No:   GB 9901857.4 
Claims searched:  1-12 
Examiner:         Geoffrey Western 
Date of search:   24 August 1999 

Databases searched: 

UK Patent Office collections, including GB, EP, WO & US patent specifications, in: 
UK Cl (Ed.Q): G4A AFMD AFMG 
Int Cl (Ed.6): G06F 11/30 11/34 17/30 
Other: Online: EPODOC, JAPIO, TDB, WPI, INTERNET 

Documents considered to be relevant: 

Category   Identity of document and relevant passage              Relevant to claims 

A,E        EP 0909082 A1 (LUCENT TECHNOLOGIES) 

A          WO 98/26571 A1 (AT&T CORP) 

X          LogDoor Web Site Monitor, 16 June 1997 press           1 and 12 at least 
           release at 
           www2.opendoor.com/logdoor/windows/ldwships.htm 

X          WebXRay review by Rick Broida, Computer Shopper,       1 and 12 at least 
           August 1997, at 
           wysiwyg://29/http://www.zdnet.com/products/content/cshp/1708/cshp0188.htm 

X          Internet Snapshot review by Rick Broida, Computer      1 and 12 at least 
           Shopper, August 1997, at 
           wysiwyg://30/http://www.zdnet.com/products/content/cshp/1708/cshp0186.htm 

Y  Document indicating lack of inventive step if combined with one or more other 
   documents of same category. 
&  Member of the same patent family. 
P  Document published on or after the declared priority date but before the filing 
   date of this invention. 
E  Patent document published on or after, but with priority date earlier than, the 
   filing date of this application. 



